Site Reliability Engineer
Description
Mumbai, India
EGNYTE YOUR CAREER. SPARK YOUR PASSION.
Egnyte is a place where we spark opportunities for amazing people. We believe that every role has meaning, and every Egnyter should be respected. With 22,000+ customers worldwide and growing, you can make an impact by protecting their valuable data. When joining Egnyte, you’re not just landing a new career; you’re becoming part of a team of Egnyters who are doers, thinkers, and collaborators, and who embrace and live by our values:
Invested Relationships
Fiscal Prudence
Candid Conversations
ABOUT EGNYTE
Egnyte is the secure multi-cloud platform for content security and governance that enables organizations to better protect and collaborate on their most valuable content. Established in 2008, Egnyte has democratized cloud content security for more than 22,000 organizations, helping customers improve data security, maintain compliance, prevent and detect ransomware threats, and boost employee productivity on any app, any cloud, anywhere. For more information, visit www.egnyte.com.
Right now, we are looking for a Site Reliability Engineer. You will ensure reliability for large-scale software: 22k+ customers, over 6,000 instances across geo-distributed data centers and cloud providers, and an average of 2k API requests per second (as measured by New Relic). People who own their work from start to finish are integral to Egnyte’s success. Our engineers are part of the whole process, from design through coding and testing to deployment, and back again for further iterations. We are looking for a mid-level engineer eager to apply software development approaches to operations. You can, and will, touch every infrastructure level depending on the day and the project you are working on.
WHAT YOU’LL DO:
- Maintain and monitor our environments as part of a 24/7 rotation, including partial night-shift coverage
- Improve our monitoring systems and identify repetitive tasks
- Cooperate with international teams
- Identify performance challenges
- Document and communicate progress on resolving issues
YOUR QUALIFICATIONS:
- At least 4 years of experience in an SRE, SysAdmin, DevOps, or equivalent role
- Practical experience administering Linux operating systems
- Solid Monitoring & DevOps skills
- Practical knowledge of container orchestration (Kubernetes, Docker)
- Familiarity with at least one monitoring tool (e.g. Icinga, New Relic, Prometheus, Grafana, OpenTSDB)
- Experience with public cloud services (GCP/AWS/Azure)
- Coding skills in Python or Golang
- Ability to work effectively in a globally distributed team structure
- Drive to grow as a Site Reliability Engineer (we value open-mindedness and a can-do attitude)
- Troubleshooting skills to hunt down the root causes of issues and persistence in preventing them from happening again
- Experience managing large numbers of diverse systems with configuration management tools such as Puppet, Ansible, or Terraform
- Solid English skills to effectively communicate with other team members (B2 level)
BONUS SKILLS:
- Practical experience using CI/CD tools such as Jenkins
- Incident management experience
- Experience with Linux HA solutions such as HAProxy
BENEFITS:
- Competitive salaries
- Company equity depending on role and level
- Medical insurance and healthcare benefits for you and your family
- Fully paid premiums for life insurance
- Flexible hours and PTO
- Mental wellness platform subscription
- Gym reimbursement
- Childcare reimbursement
- Group term life insurance
COMMITMENT TO DIVERSITY, EQUITY, AND INCLUSION:
At Egnyte, we celebrate our differences and thrive on our diversity for our employees, our products, our customers, our investors, and our communities. Egnyters are encouraged to bring their whole selves to work and to appreciate the many differences that collectively make Egnyte a higher-performing company and a great place to be.